-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-10246. Remove KeyValueHandler.checkContainerIsHealthy to improve read performance #6127
Conversation
@ChenSammi PTAL, Thanks ! |
} finally { | ||
kvContainer.readUnlock(); | ||
// No kvContainer.readLock() for performance optimization | ||
if (kvContainer.getContainerData().getState() == State.UNHEALTHY) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you seeing performance problems when the container is unhealthy, along with a lot of log messages?
This lock is on a container, not "all containers at the same time", so do you also have a very high volume of requests for the same block?
At the very least, we should not log under a lock, as logging can be slow, but I would like to understand more about the problem before just removing the lock. On first look, it does not seem like it should be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines 139 to 143 in bd1cf68
private ContainerBackgroundTaskResult handleDeleteTask() throws Exception { | |
ContainerBackgroundTaskResult crr; | |
final Container container = ozoneContainer.getContainerSet() | |
.getContainer(containerData.getContainerID()); | |
container.writeLock(); |
BlockDeletingTask#handleDeleteTask
will hold KeyValueContainer.writeLock()
, and sometimes it takes a long time, as follow logs:
2024-01-04 16:30:16,752 [BlockDeletingService#8] INFO org.apache.hadoop.ozone.container.keyvalue.statemachine.background.BlockDeletingTask: Max lock hold time (1000 ms) reached after 1390 ms. Completed 1 transactions, deleted 0 blocks. In container: 5455053. Releasing lock and resuming deletion later.
2024-01-04 17:01:16,897 [BlockDeletingService#1] INFO org.apache.hadoop.ozone.container.keyvalue.statemachine.background.BlockDeletingTask: Max lock hold time (1000 ms) reached after 1292 ms. Completed 1 transactions, deleted 0 blocks. In container: 5458979. Releasing lock and resuming deletion later.
This will result in a long waiting when the data in this container is read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sodonnel , you can find the detail info under HDDS-10146. The scenario is when there are a lot of block deletions happening, read other blocks in the same container as being deleted block will be blocked due to unable to query container's read lock. The lock is hold by the thread doing the block deletion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think removing this lock will help much because:
isContainerHealthy()
calls kvContainer.getContainerData().getState()
which looks like:
public synchronized ContainerDataProto.State getState() {
return state;
}
Which is actually worse than the read lock, as the synchronized method prevents even concurrent readers. Even after removing this read lock, you will need to deal with the synchronized method. There are quite a few methods on that class which are synchronized.
All this method is doing is logging a message if the container is unhealthy - do we need to do that? The method is called in only 4 places, always in this sort of pattern:
try {
BlockID blockID = BlockID.getFromProtobuf(
request.getGetBlock().getBlockID());
checkContainerIsHealthy(kvContainer, blockID, Type.GetBlock);
responseData = blockManager.getBlock(kvContainer, blockID)
.getProtoBufMessage();
final long numBytes = responseData.getSerializedSize();
metrics.incContainerBytesStats(Type.GetBlock, numBytes);
} catch (StorageContainerException ex) {
return ContainerUtils.logAndReturnError(LOG, ex, request);
} catch (IOException ex) {
return ContainerUtils.logAndReturnError(LOG,
new StorageContainerException("Get Key failed", ex, IO_EXCEPTION),
request);
}
return getBlockDataResponse(request, responseData);
}
If the container is unhealthy, as we don't get an error on the read, do we really care if it is unhealthy? Perhaps checkContainerHealthy could be moved into the exception block to log only if there problems?
Another question is - does ContainerData.getState()
even need to be synchronized? As I understand it, updating a reference in Java is atomic - and the state is just a reference which can be changed by updateState()
- perhaps it is fine for getState() to not be synchronized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sodonnel Thanks for the advice.
Here is the jstack which readLock
in WAITING state long time: dn.chunkreader-waiting.jstack.txt
Removing this read lock can solve the problem directly. From this point, it is effective.
All this method is doing is logging a message if the container is unhealthy - do we need to do that?
I did find some this unhealthy logs, but it did not affect reading. It seems that some optimization can indeed be done here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @sodonnel that we can actually remove the checkContainerIsHealthy call, as it just log one message, which is not much useful. @whbing , could you help to update the patch?
Regarding the synchronized functions in ContainerData, all of them are simple actions/checks against Container state. I feel the performance overhead is not that much. Need some data to judge that. Anyway, it's a new and another topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realized after I wrote the above, the the synchronization on the getter is probably not an issue. The problem root cause is the write lock being held for a long time by the delete thread. It would still be best to avoid the synchronization too if we can just remove the call entirely. The log message doesn't add much value compare to the cost of it via serialization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChenSammi @sodonnel Updated commit with just removing checkContainerIsHealthy
method and the call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @whbing for working on this patch.. Was having a quick look at the PR. And somewhat agree with @sodonnel org.apache.hadoop.ozone.container.common.impl.ContainerData#setState is synchronized and gets called from 2 callers inside ContainerData
-> quasiCloseContainer
and closeContainer
and they are being called inside writeLock on container, so container state change is already inside writelock of container object. So agree that taking kvContainer.readLock
inside checkContainerIsHealthy
may be overwhelming as it is trying to wait for lock on whole container calling for getBlock or GetCommitedBlockLength calls, but replacing with kvContainer.getContainerData().getState()
may also not be a right solution as this getState
is fully synchronized method. We need to see if this method is really needed to be synchronized
. I feel it is not need to be synchronized
.
Not sure to what extent it affects performance. At least in practice, we haven't encountered any bottlenecks caused by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1.
Thanks @whbing for reporting the issue and also the fix, @sodonnel @devmadhuu for the review.
…y to improve read performance (apache#6127) (cherry picked from commit 528fc92) Change-Id: Ie4438d8e95a1148e7050f16201b4c890ddbee400
… read performance (apache#6127) (cherry picked from commit 528fc92)
… read performance (apache#6127) (cherry picked from commit 528fc92)
… read performance (apache#6127) (cherry picked from commit 528fc92)
What changes were proposed in this pull request?
As described in HDDS-10146, all of the ChunkReader threads are blocked in the
KeyValueHandler#checkContainerIsHealthy
method, waiting for theKeyValueContainer#readLock
.Remove readLock from KeyValueHandler.checkContainerIsHealthy to improve performance
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10246
How was this patch tested?